Supervised Topical Key Phrase Extraction of News Stories using Crowdsourcing, Light Filtering and Co-reference Normalization
Fast and effective automated indexing is critical for search and personalized
services. Key phrases that consist of one or more words and represent the main
concepts of the document are often used for the purpose of indexing. In this
paper, we investigate the use of additional semantic features and
pre-processing steps to improve automatic key phrase extraction. These features
include the use of signal words and Freebase categories. Some of these features
lead to significant improvements in the accuracy of the results. We also
experimented with two forms of document pre-processing that we call light
filtering and co-reference normalization. Light filtering removes from the
document those sentences that are judged peripheral to its main content.
Co-reference normalization unifies several written forms of the same named
entity into a unique form. We also needed a "Gold Standard" - a set of labeled
documents for training and evaluation. While the subjective nature of key
phrase selection precludes a true "Gold Standard", we used Amazon's Mechanical
Turk service to obtain a useful approximation. Our data indicate that the
biggest improvements in performance were due to shallow semantic features, news
categories, and rhetorical signals (nDCG 78.47% vs. 68.93%). The inclusion of
deeper semantic features such as Freebase sub-categories was not beneficial by
itself but, in combination with pre-processing, did cause slight improvements
in the nDCG scores.
Comment: In 8th International Conference on Language Resources and Evaluation (LREC 2012)
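For readers unfamiliar with the nDCG metric quoted in this abstract, a minimal sketch follows. It assumes the common log2-discounted definition with graded relevance; the abstract does not say which variant the authors used, and the function names and relevance grades below are hypothetical.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: sum of rel / log2(rank + 1), with ranks starting at 1."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal (descending) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades of extracted key phrases, in ranked order.
ranked = [3, 2, 3, 0, 1]
score = ndcg(ranked)
```

A ranking that already lists the most relevant phrases first scores 1.0; placing a relevant phrase lower in the list reduces the score.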
Modelling the Potential Spatial Distribution of the African Elephant (Loxodonta africana, Blumenbach) in the Maputo Special Reserve, Mozambique
V Conferência Nacional de Cartografia e Geodesia (5th National Conference on Cartography and Geodesy), Congress Centre of the Laboratório Nacional de Engenharia Civil, Lisbon, Portugal, 19-20 April 2007. This paper presents an approach to the problem of modelling the spatial distribution of wild species, applied to the specific case of the African elephant. The aim is to obtain empirically a map of the potential distribution of the African elephant in the Maputo Special Reserve (REM), integrating statistical methods and geographic modelling for that purpose. The distribution is analysed for the two seasons that typically influence the behavioural patterns of elephants in the REM (i.e., winter vs. summer). The models were evaluated using the area under the ROC curve. Given that the uncertainty associated with producing the models has multiple sources, and that their calibration did not prove better than the null hypothesis (that is, there would be no difference between the observed values and those estimated by the model), both models are found to classify adequately the sites (habitats) that should be priorities for management and conservation, judged by the area under the ROC curve, which indicates the reliability of the probabilistic values obtained. For both models these values are considered "excellent": 0.863 (95% CI: 0.822-0.904) and 0.830 (95% CI: 0.781-0.878) for the summer and winter models, respectively.
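The area under the ROC curve used to evaluate these models can be illustrated with a minimal sketch. It assumes the rank-based (Mann-Whitney) formulation of AUC; the function name and the suitability scores are hypothetical and are not taken from the study.

```python
def roc_auc(scores_presence, scores_absence):
    """Probability that a random presence site scores higher than a random
    absence site; ties count as half a win."""
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))

# Hypothetical modelled habitat-suitability scores.
presence = [0.9, 0.4]
absence = [0.5, 0.1]
auc = roc_auc(presence, absence)
```

An AUC of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to a model that ranks every presence site above every absence site.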
Key Phrase Extraction of Lightly Filtered Broadcast News
This paper explores the impact of light filtering on automatic key phrase
extraction (AKE) applied to Broadcast News (BN). Key phrases are words and
expressions that best characterize the content of a document. Key phrases are
often used to index the document or as features in further processing. This
makes improvements in AKE accuracy particularly important. We hypothesized that
filtering out marginally relevant sentences from a document would improve AKE
accuracy. Our experiments confirmed this hypothesis. Elimination of as little
as 10% of the document sentences led to a 2% improvement in AKE precision and
recall. Our AKE system is built on the MAUI toolkit, which follows a supervised learning
approach. We trained and tested our AKE method on a gold standard made of 8 BN
programs containing 110 manually annotated news stories. The experiments were
conducted within a Multimedia Monitoring Solution (MMS) system for TV and radio
news/programs, running daily, and monitoring 12 TV and 4 radio channels.
Comment: In 15th International Conference on Text, Speech and Dialogue (TSD 2012)
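The precision and recall figures in this abstract compare extracted key phrases against manually annotated ones. A minimal sketch of that comparison follows, assuming exact string matching over sets of phrases; the abstract does not describe the matching rule, and the function name and phrases below are hypothetical.

```python
def precision_recall(extracted, gold):
    """Precision and recall of extracted key phrases against a gold-standard set."""
    extracted_set, gold_set = set(extracted), set(gold)
    hits = len(extracted_set & gold_set)
    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    return precision, recall

# Hypothetical phrases for one news story.
extracted = ["election", "prime minister", "weather", "budget"]
gold = ["election", "budget", "parliament"]
precision, recall = precision_recall(extracted, gold)
```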
Understanding the relationship between illness perceptions of breast cancer and perceived risk in a sample of U.A.E. female university students: the role of comparative risk
BMC Women's Health is an open access, peer-reviewed journal that considers articles on all aspects of the health and wellbeing of adolescent girls and women, with a particular focus on the physical, mental, and emotional health of women in developed and developing nations. The journal welcomes submissions on women's public health issues, health behaviours, breast cancer, gynecological diseases, mental health and health promotion.